IN THE CLAIMS:

Please cancel claims 3, 4, 13, 15, 16, 19, and 22-24 without prejudice. 

Please amend claims 1, 2, 5-12, 14, 17, 18, 20, 21, 25, and 26 as follows: 
1. (Currently Amended) A method for classifying electronically posted documents, the
method comprising: 

receiving a first document and a second document; 

generating a first metadata summary for said first document and a
second metadata summary for the second document, wherein the first metadata
summary includes a plurality of sub-trees and the second metadata
summary includes a plurality of sub-trees, and wherein each of the
sub-trees includes a plurality of nodes;

comparing the first and second metadata summaries on a structural level by comparing
a structure of the sub-trees of the first metadata summary with a structure of the
sub-trees of the second metadata summary; and

identifying the first and second documents as distinct if the structures of the
sub-trees of the first and second metadata summaries are not equivalent;

if the structures of the sub-trees of the first and second metadata summaries are 
equivalent, performing a further comparison of the first and second metadata summaries,

wherein the further comparison of the first and second metadata summaries includes the 
sub-steps of: 

comparing the first and second metadata summaries on a textual level by 
comparing textual content from the first document that is contained in the sub-trees of the 
first metadata summary with textual content from the second document that is contained 
in the sub-trees of the second metadata summary; and 

identifying the first and second documents as distinct if the textual content within 
the sub-trees of the first and second metadata summaries are not equivalent.
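
By way of illustration only, the two-level comparison recited in claim 1 might be sketched as follows. The `Node` class, its `tag`, `text`, and `children` fields, and the function names are hypothetical and do not appear in the claims. The sketch also assumes that "structure" means node tags plus nesting, which the claim language does not define.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical node of a metadata summary sub-tree."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def structure(node):
    """Shape of a sub-tree: tags and nesting only, text ignored (assumption)."""
    return (node.tag, tuple(structure(c) for c in node.children))


def textual_content(node):
    """Text carried by a sub-tree, in document order."""
    return (node.text, tuple(textual_content(c) for c in node.children))


def are_distinct(first, second):
    """Compare two metadata summaries, each given as a list of sub-trees."""
    # Structural level: differing shapes settle the question immediately.
    if [structure(t) for t in first] != [structure(t) for t in second]:
        return True
    # Textual level: reached only when the structures are equivalent.
    if [textual_content(t) for t in first] != [textual_content(t) for t in second]:
        return True
    return False
```

Ordering the cheaper structural check first means the textual comparison runs only for summaries whose sub-trees already match in shape.
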
2. (Currently Amended) The method of claim 1, wherein the further comparison of the first
and second metadata summaries further includes the sub-steps of:

before comparing the first and second metadata summaries on a textual level, comparing
the first and second metadata summaries on an attribute level by comparing the attribute
values within the sub-trees of the first metadata summary with the attribute
values within the sub-trees of the second metadata summary; and

identifying the first and second documents as distinct if the attribute values within the
sub-trees of the first and second metadata summaries are not equivalent.
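
As a non-authoritative sketch, the ordering in claim 2 (structural level, then attribute level, then textual level) could be expressed as a cascade that stops at the first level where the summaries differ. The dictionary layout of a node (`tag`, `attrs`, `text`, `children`) and the names `project`, `LEVELS`, and `first_difference` are assumptions made for illustration and are not taken from the claims.

```python
LEVELS = ("structure", "attributes", "text")


def project(node, level):
    """Reduce a node to the information relevant at one comparison level."""
    if level == "structure":
        own = node["tag"]
    elif level == "attributes":
        own = tuple(sorted(node.get("attrs", {}).items()))
    elif level == "text":
        own = node.get("text", "")
    else:
        raise ValueError(f"unknown level: {level}")
    return (own, tuple(project(c, level) for c in node.get("children", [])))


def first_difference(first, second):
    """Return the first level at which two summaries differ, or None."""
    for level in LEVELS:
        if [project(t, level) for t in first] != [project(t, level) for t in second]:
            return level
    return None
```

A return value of `None` means no difference was found at any of the three levels.
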

3-4. (Canceled) 

5. (Currently Amended) The method of claim 1, further comprising identifying the first
and second documents as duplicates if the textual content within the sub-trees of
the first and second metadata summaries are equivalent.

6. (Currently Amended) The method of claim 5, further comprising removing the second
metadata summary if the first and second documents are
identified as duplicates.
7. (Currently Amended) The method of claim 1, further comprising:
defining a first equivalence metadata table comprising: 

a first row corresponding to the first metadata summary; 

a second row corresponding to the second metadata summary; 

a first column corresponding to the first metadata summary; and 

a second column corresponding to the second metadata summary, 
wherein the process step of identifying the first and second documents as distinct if the 
structures of the sub-trees of the first and second metadata
summaries are not equivalent comprises storing a zero binary value in the first row and second 
column position of the first equivalence metadata table. 
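
The table recited in claim 7 can be pictured as a square grid with one row and one column per metadata summary. The following is a minimal sketch under stated assumptions: the function name, the `are_distinct` callback, and the initial value of 1 for every entry are hypothetical, since the claim specifies only that a zero is stored at the first-row, second-column position when the pair is identified as distinct.

```python
def build_equivalence_table(summaries, are_distinct):
    """Build a table whose entry [i][j] is 0 once summaries i and j are found distinct."""
    n = len(summaries)
    # Assumption: every entry starts at 1 and is cleared to 0 on a mismatch.
    table = [[1] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if are_distinct(summaries[i], summaries[j]):
                # Row of the first summary, column of the second summary.
                table[i][j] = 0
    return table
```

Only positions above the diagonal are written here, mirroring the single row-and-column position named in the claim.
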

8. (Currently Amended) The method of claim 2, further comprising: 
defining a first equivalence metadata table comprising: 

a first row corresponding to the first metadata summary; 

a second row corresponding to the second metadata summary; 

a first column corresponding to the first metadata summary; and 

a second column corresponding to the second metadata summary, 
wherein the process step of identifying the first and second documents as distinct if the 
attribute values within the sub-trees of the first and second
metadata summaries are not equivalent comprises storing a zero binary value in the first row and
second column position of the first equivalence metadata table. 
9. (Currently Amended) The method of claim 7,
wherein the process step of identifying the first and second documents as distinct if the
textual content within the sub-trees of the first and second
metadata summaries are not equivalent comprises a storing of a zero binary value in the first
row and second column position of the first equivalence metadata table.

10. (Currently Amended) A method for classifying electronically posted documents, the 
method comprising: 

receiving a plurality of documents; 

generating a metadata summary for
each of the plurality of received documents, wherein each of the metadata summaries includes a
plurality of sub-trees and each of the sub-trees includes a plurality of nodes;

grouping a first subset of the metadata summaries into a first
summary group, the first summary group consisting of all of the metadata summaries
having a first mime-type designation;

selecting a first metadata summary and a second metadata summary from the first
summary group;

comparing the first and second metadata summaries on a structural level by comparing
a structure of the sub-trees of the first metadata summary with a structure of the
sub-trees of the second metadata summary; and

identifying the first and second documents as distinct if the structures of the
sub-trees of the first and second metadata summaries are not equivalent.
11. (Currently Amended) The method of claim 10, wherein the step of grouping further
comprises grouping a second subset of the metadata summaries into a second
summary group, the second summary group consisting of all of the metadata
summaries having a second mime-type designation.
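
For illustration, the grouping recited in claims 10 and 11 might be sketched as below, so that only summaries sharing a mime-type designation are selected for comparison. The `mime_type` key and both function names are hypothetical; the claims do not specify how a summary records its mime-type.

```python
from collections import defaultdict
from itertools import combinations


def group_by_mime_type(summaries):
    """Place every summary into the group for its mime-type designation."""
    groups = defaultdict(list)
    for summary in summaries:
        groups[summary["mime_type"]].append(summary)
    return dict(groups)


def candidate_pairs(groups):
    """Yield each pair of summaries drawn from the same summary group."""
    for members in groups.values():
        yield from combinations(members, 2)
```

Each group holds all of the summaries with a given designation, so no pair is ever formed across two different mime-types.
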

12. (Currently Amended) A system for classifying electronically posted documents, the 
system comprising: 

a metadata parser module coupled to receive electronically posted documents, the 
metadata parser configured to output a metadata summary for each of the
posted documents, wherein each of the metadata summaries comprises a plurality of
sub-trees and each of the sub-trees includes a plurality of nodes;

a summary repository coupled to receive and store the metadata summaries;

and 

a summary consolidator coupled to the summary repository, the summary consolidator 
configured to:

compare a first of the metadata summaries and a second
of the metadata summaries on a structural level by comparing a structure of the sub-trees
of the first metadata summary with a structure of the sub-trees of the second metadata
summary;

identify the first and second documents corresponding to the first and second
metadata summaries as distinct if the structures of the sub-trees of the first and
second metadata summaries are not equivalent; and

if the structures of the sub-trees of the first and second metadata summaries are
equivalent, further compare the first and second metadata summaries,

wherein the further comparison of the first and second metadata summaries 
includes: 
comparing the first and second metadata summaries on a textual level by
comparing textual content from the first document that is contained in the sub-trees
of the first metadata summary with textual content from the second
document that is contained in the sub-trees of the second metadata summary; and

identifying the first and second documents corresponding to the first and
second metadata summaries as distinct if the textual content within the sub-trees
of the first and second metadata summaries are not equivalent.

13. (Canceled) 

14. (Currently Amended) The system of claim 12, wherein the further comparison of the
first and second metadata summaries further includes:

before comparing the first and second metadata summaries on a textual level, comparing 
the first and second metadata summaries on an attribute level by comparing attribute values 
within the sub-trees of the first metadata summary with attribute values within the sub-trees of 
the second metadata summary; and 

identifying the first and second documents corresponding to the first and second metadata 
summaries as distinct if the attribute values within the sub-trees of the first and second metadata 
summaries are not equivalent. 

15. (Canceled) 

16. (Canceled) 

17. (Currently Amended) A program product for use in a computer system that executes 
program steps recorded in a computer-readable media to perform a method for classifying 
electronically posted documents, the program product comprising: 

a record-able media; 
a program of computer-readable instructions executable by the computer system to
perform processes comprising the steps of:

receiving a first document and a second document; 

generating a first metadata summary for said first document and
a second metadata summary for the second document, wherein the first
metadata summary includes a plurality of sub-trees and the second
metadata summary includes a plurality of sub-trees, and
wherein each of the sub-trees includes a plurality of nodes;

comparing the first and second metadata summaries on a structural level by
comparing a structure of the sub-trees of the first metadata summary
with a structure of the sub-trees of the second metadata summary; and

identifying the first and second documents as distinct if the structures of
the sub-trees of the first and second metadata summaries are not
equivalent;

if the structures of the sub-trees of the first and second metadata summaries are
equivalent, performing a further comparison of the first and second metadata summaries, 

wherein the further comparison of the first and second metadata summaries 
includes the sub-steps of: 

comparing the first and second metadata summaries on a textual level by 
comparing textual content from the first document that is contained in the sub- 
trees of the first metadata summary with textual content from the second 
document that is contained in the sub-trees of the second metadata summary; and

identifying the first and second documents as distinct if the textual content
within the sub-trees of the first and second metadata summaries are not
equivalent.
18. (Currently Amended) The program product of claim 17, wherein the further comparison
of the first and second metadata summaries further includes the sub-steps of:

before comparing the first and second metadata summaries on a textual level, comparing
the first and second metadata summaries on an attribute level by comparing the attribute
values within the sub-trees of the first metadata summary with the attribute
values within the sub-trees of the second metadata summary; and

identifying the first and second documents as distinct if the attribute values within the
sub-trees of the first and second metadata summaries are not equivalent.

19. (Canceled) 

20. (Currently Amended) The program product of claim 17, further comprising the
method step of identifying the first and second documents as duplicates if the textual content
within the sub-trees of the first and second metadata summaries are
equivalent.

21. (Currently Amended) The program product of claim 20, further comprising the process
step of removing the second metadata summary if the first and second documents are identified
as duplicates.



22-24. (Canceled) 
25. (Currently Amended) A program product for use in a computer system that executes
program steps recorded in a computer-readable media to perform a method for classifying 
electronically posted documents, the program product comprising: 
a record-able media; 

a program of computer-readable instructions executable by the computer system to 
perform method steps comprising: 

receiving a plurality of documents; 

generating a metadata
summary for each of the plurality of received documents, wherein each of the metadata
summaries includes a plurality of sub-trees and each of the sub-trees includes a plurality
of nodes;

grouping a first subset of the metadata summaries into
a first summary group, the first summary group consisting of all of the
metadata summaries having a first mime-type designation;

selecting a first metadata summary and a second metadata summary from the first
summary group;

comparing the first and second metadata summaries on a structural level by
comparing a structure of the sub-trees of the first metadata summary
with a structure of the sub-trees of the second metadata summary; and

identifying the first and second documents as distinct if the structures of
the sub-trees of the first and second metadata summaries are not
equivalent.
26. (Currently Amended) The program product of claim 25, wherein the step of grouping
further comprises grouping a second subset of the respective metadata summaries into a second
summary group, the second summary group consisting of all of the metadata
summaries having a second mime-type designation. 



